REMARKS

By the present response, claims 1, 7, 11 and 19 have been amended, and claims 
4-5, 9-10, 16-17 and 21-22 have been canceled. Thus, after the present amendment, 
claims 1-3, 6-8, 11-15 and 18-20 remain in the present application. Reconsideration and 
allowance of outstanding claims 1-3, 6-8, 11-15 and 18-20 in view of the above 
amendments and the following remarks are respectfully requested. 

A. Rejections of Claims 1-22 under 35 USC § 102(e) 

The Examiner has rejected claims 1-22 under 35 USC § 102(e) as being anticipated 
by U.S. Patent Application Publication Number US 2002/0042909 A1 to Van Gageldonk, 
et al. ("Van Gageldonk"). For the reasons discussed below, Applicants respectfully 
submit that the present invention, as defined by independent claims 1, 7, 11, and 19, is 
patentably distinguishable over Van Gageldonk. 

As disclosed in the present application, conventional approaches in the processor 
architecture field do not adequately address the problem of consumption of chip area for 
wide buses, such as wide "move" buses linking various register file banks. Various 
embodiments according to the present invention address and overcome the need in the art 
for speeding up the very long instruction word ("VLIW") processor architecture, reducing 
power consumption, and reducing chip area while accommodating multiple register file 
banks and multiple execution units. 

Embodiments according to the present invention, as shown in Figure 2 of the 
present application, include first and second register file banks. The first register file 
bank comprises a first plurality of read ports, and the second register file bank comprises 
a second plurality of read ports. A first data path block comprises a first plurality of 
execution units, and a second data path block comprises a second plurality of execution 
units. A first plurality of buses couple the first plurality of read ports to each of the first 
and second data path blocks. A second plurality of buses couple the second plurality of 
read ports to each of the first and second data path blocks. An operand residing in the 
first plurality of read ports is concurrently accessed by the first plurality of execution units 
in the first data path block and by the second plurality of execution units in the second 
data path block. 
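
The read-bus connectivity described above can be illustrated with a minimal sketch. It is not taken from the present application: the class names, the field names, and the sample register values are hypothetical, and the model assumes only what the paragraph states, namely that the read buses of each register file bank reach both data path blocks.

```python
# Illustrative model only; all names and values are hypothetical.
from dataclasses import dataclass, field


@dataclass
class RegisterFileBank:
    name: str
    registers: dict = field(default_factory=dict)  # register index -> value

    def read(self, index):
        return self.registers[index]


@dataclass
class DataPathBlock:
    name: str
    banks: list  # every bank whose read buses reach this block

    def fetch(self, bank_name, index):
        # Read an operand directly from whichever bank holds it.
        for bank in self.banks:
            if bank.name == bank_name:
                return bank.read(index)
        raise KeyError(bank_name)


first_bank = RegisterFileBank("first", {0: 0x1234})
second_bank = RegisterFileBank("second", {0: 0xBEEF})

# Both data path blocks are coupled to the read ports of both banks.
first_block = DataPathBlock("first", [first_bank, second_bank])
second_block = DataPathBlock("second", [first_bank, second_bank])

# The same operand in the first bank is read by both blocks, with no
# intermediate copy into the second bank.
shared_reads = [first_block.fetch("first", 0), second_block.fetch("first", 0)]
```

In this sketch neither block holds a private copy of the operand; each reads it from the bank in which it resides.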

In a conventional VLIW processor there are "move" buses that carry source 
operands from one register file bank to the other. This architecture consumes one or more 
additional clock cycles and accordingly reduces the operating speed of the conventional 
VLIW processor. Additionally, transfer of a source operand in this manner results in 
significant additional power consumption in the conventional VLIW processor since a 
"toggling" of potentially all of the bits (e.g. 64 bits) in a move bus might occur in order to 
complete the transfer of the source operand between register file banks. See, for example, 
present application at page 11, and Figure 1. 

There are advantageously no move buses in the VLIW processor according to 
embodiments of the present invention, as explicitly set forth in the present application and 
the pending claims. See, for example, present application at page 14 and independent 
claim 1. In the present embodiments, because operands are delivered directly 
from either register file bank to either data path block, the additional clock cycle required 
to move an operand from one register file bank to the other register file bank prior to the 
delivery of the operand to the destination data path block is eliminated. Since operands 
do not go through move buses 170 and 172 (shown in Figure 1 of the present application), 
increased speed is achieved due to the elimination of the additional clock cycle existing in 
a conventional VLIW processor. Moreover, the charging and discharging of these buses 
for the purpose of accomplishing a move is avoided, and as such, a substantial power 
savings is achieved. Thus, by replacing move buses 170 and 172 (Figure 1 of the present 
application) in a conventional VLIW processor with read buses 260, 262, 264, and 266 in 
VLIW processor 200 (Figure 2 of the present application), embodiments of the invention 
achieve increased speed and reduced power without increasing the required chip area. 
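
The speed and power argument above can be made concrete with a rough sketch. The figures are assumptions for illustration, not values from the present application: the sketch assumes one cycle for a register read, exactly one additional cycle for a move (the application states "one or more"), and a 64-bit move bus, and it counts only move-bus toggling, ignoring the switching of the read buses themselves. The function names are hypothetical.

```python
def bus_toggles(previous_value, new_value, width=64):
    """Count the bus lines that change state when new_value is driven."""
    mask = (1 << width) - 1
    return bin((previous_value ^ new_value) & mask).count("1")


def conventional_remote_read(move_bus_state, operand, width=64):
    # Assumed: one cycle to move the operand across the move bus,
    # then one cycle to read it from the local register file bank.
    return {
        "cycles": 2,
        "move_bus_toggles": bus_toggles(move_bus_state, operand, width),
    }


def direct_remote_read():
    # Assumed: the operand is read over a read bus in a single cycle;
    # no move bus exists, so no move-bus lines toggle.
    return {"cycles": 1, "move_bus_toggles": 0}


# Worst case for the move bus: every one of the 64 lines changes state.
worst_case = conventional_remote_read(0, (1 << 64) - 1)
direct = direct_remote_read()
cycles_saved = worst_case["cycles"] - direct["cycles"]
```

Under these assumptions the conventional path costs one extra cycle and up to 64 line toggles per remote operand, both of which the direct read path avoids.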

In contrast, Van Gageldonk is directed to a retargetable compiler where various 
instruction sets allow the use of particular functional units and register files. As such, 
Van Gageldonk fails to disclose the power and space efficient bus architecture defined by 
amended independent claims 1, 7, 11 and 19. For example, as seen in Figure 2 of the 
present application, data path blocks 212 and 214 are configured such that they are only 
able to write data to an adjacent (i.e. a respective) register file bank. Data path block 212, 
for example, can only write to register file bank 252 and not to register file bank 254. 
Furthermore, the bus architecture defined by amended independent claims 1, 7, 11 and 19 
allows other data path blocks, for example data path block 214 in Figure 2, to read 
particular data written to other "remote" registers. Accordingly, data in a particular 
register file bank is equally available to both data path blocks 212 and 214. This feature 
is advantageous because the 32-bit wide write buses 162 and 164 (seen in Figure 1 of the 
present application) that are found in conventional VLIW processors can be eliminated, 
saving considerable chip area. 

Van Gageldonk, however, merely discloses a general diagram (seen in Figure 1) 
that only indicates the availability of register file RF1' to both functional unit clusters 
UC1 and UC2. More specifically, Van Gageldonk does not teach the particular 
interconnections that exist between RF1' and any physical registers that might be 
contained therein. Therefore, the present invention is patentably distinguishable over Van 
Gageldonk. The Examiner has referred to Van Gageldonk, paragraph 0026, lines 18-22, 
as purportedly disclosing that each unit cluster of Van Gageldonk can access operands 
from either register file RF1' (the visible register file) or register file RF1 (the invisible 
register file). However, as disclosed in Van Gageldonk, even if operands were available 
from both the RF1' and RF1 register files, register files other than RF1' can be viewed only 
during "a complete instruction set for the time-critical parallel code." See, for 
example, paragraph 0026 of Van Gageldonk. Moreover, Van Gageldonk does not even 
specifically mention register file RF1 as one of those other register files that can be 
viewed only during "a complete instruction set for the time-critical parallel code." 

For the foregoing reasons, Applicants respectfully submit that the present 
invention as defined by amended independent claims 1, 7, 11, and 19 is not taught, 
disclosed, or suggested by the art of record. As such, the claims depending from 
amended independent claims 1, 7, 11, and 19 are, a fortiori, also patentable for at least the 
reasons presented above and also for additional limitations contained in each dependent 
claim. 

B. Conclusion 

Based on the foregoing reasons, the present invention, as defined by amended 
independent claims 1, 7, 11, and 19, and the claims depending therefrom, is patentably 
distinguishable over the art cited by the Examiner. As such, and for all the foregoing 
reasons, an early Notice of Allowance directed to claims 1-3, 6-8, 11-15 and 18-20 
remaining in the present application is respectfully requested. 

Respectfully Submitted, 
FARJAMI & FARJAMI LLP 





Michael Farjami, Esq. 